Investigations into phonological attribute classifier representations for CRF phone recognition
Abstract
Classifier combination has long been a staple for improving the robustness of ASR systems. We present an experiment in which introducing phonological feature scores from another lab's system [1] into our own gives a statistically significant improvement in Conditional Random Field (CRF)-based TIMIT phone recognition, even though a standalone system based on their features performs significantly worse. The second part of the paper explores the reasons for this improvement by examining different representations of phonological attribute classifiers, in terms of both what they classify (binary versus n-ary features) and how their scoring functions are represented. The analysis leads to the conclusion that, while binary phonological feature estimates are usually worse than n-ary features, combining the two can work well if there are also differences in the feature definitions or training paradigm.
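As a rough illustration of the distinction between binary and n-ary attribute classifiers, the sketch below builds one per-frame input vector from both kinds of score. It is not the authors' method: the attribute inventories, the raw classifier outputs, and the choice between posteriors and log posteriors as the scoring-function representation are all assumptions made for the example.

```python
import math

# Hypothetical n-ary attribute: a single classifier with one posterior per class.
MANNER_CLASSES = ["stop", "fricative", "nasal", "vowel", "silence"]
# Hypothetical binary attributes: one independent detector per attribute.
BINARY_ATTRIBUTES = ["voiced", "nasal", "continuant"]


def softmax(logits):
    """Turn raw n-ary classifier outputs into posteriors that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logit):
    """Turn one raw binary detector output into P(attribute present)."""
    return 1.0 / (1.0 + math.exp(-logit))


def frame_features(manner_logits, binary_logits, use_log=False):
    """Concatenate n-ary and binary scores into one per-frame vector."""
    nary = softmax(manner_logits)
    binary = [sigmoid(v) for v in binary_logits]
    scores = nary + binary
    if use_log:
        # Assumed alternative scoring representation: floored log posteriors.
        scores = [math.log(max(s, 1e-10)) for s in scores]
    return scores


# Made-up classifier outputs for a single frame.
features = frame_features([2.0, 0.1, -1.0, 0.5, -2.0], [1.5, -3.0, 0.0])
log_features = frame_features([2.0, 0.1, -1.0, 0.5, -2.0], [1.5, -3.0, 0.0],
                              use_log=True)
print(len(features))
```

In this sketch the n-ary scores compete with one another through the softmax, while each binary score is estimated independently, so the two blocks of the vector can carry different information about the same frame.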
Similar resources
Joint Versus Independent Phonological Feature Models within CRF Phone Recognition
We compare the effect of joint modeling of phonological features to independent feature detectors in a Conditional Random Fields framework. Joint modeling of features is achieved by deriving phonological feature posteriors from the posterior probabilities of the phonemes. We find that joint modeling provides superior performance to the independent models on the TIMIT phone recognition task. We ...
Face Recognition in Thermal Images based on Sparse Classifier
Despite recent advances in face recognition systems, they suffer from serious problems because of the extensive types of changes in human face (changes like light, glasses, head tilt, different emotional modes). Each one of these factors can significantly reduce the face recognition accuracy. Several methods have been proposed by researchers to overcome these problems. Nonetheless, in recent ye...
Investigations into the Crandem Approach to Word Recognition
We suggest improvements to a previously proposed framework for integrating Conditional Random Fields and Hidden Markov Models, dubbed a Crandem system (2009). The previous authors’ work suggested that local label posteriors derived from the CRF were too low-entropy for use in word-level automatic speech recognition. As an alternative to the log posterior representation used in their system, we ...
Representing Phonological Features Through a Two-Level Finite State Model
Articulatory information has been shown to improve phone recognition performance in ASR systems, with neural networks being the most successful method for detecting articulatory gestures from the speech signal. On the other hand, Stochastic Finite State Automata (SFSA) have been effectively used in many speech-input natural language tasks. In this work SFSA are used to represen...
A Study on the Use of Conditional Random Fields for Automatic Speech Recognition
Current state of the art systems for Automatic Speech Recognition (ASR) use statistical modeling techniques such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to recognize spoken language. These techniques make use of statistics derived from the acoustic frequencies of the speech signal. In recent years, interest has been rising in the use of phonological features derived fr...